CT-FC: more Comprehensive Traversal Focused Crawler
Similar resources
YAFC: Yet Another Focused Crawler
As the Web continues to grow rapidly, focused topic-specific Web crawlers will gain popularity over traditional general-purpose search engines for locating, indexing and keeping up to date information on the Web. This paper presents YAFC (Yet Another Focused Crawler), a neurodynamic programming approach to focused crawling. YAFC combines TD(λ) reinforcement learning with a neural network to lea...
An Ontology-Based Focused Crawler
In this paper we present a novel approach for building a focused crawler. The goal of our crawler is to effectively identify web pages that relate to a set of predefined topics and download them regardless of their web topology or connectivity with other popular pages on the web. The main challenges that we address in our study concern the following. First we need to be able to effectively iden...
Ranking Hyperlinks Approach for Focused Web Crawler
The World Wide Web is growing rapidly and many search engines do not cover all the visible pages. Therefore, a more effective crawling method is required to collect more accurate data. In this paper, we introduce an effective focused web crawler containing smart methods. In text analysis, similarity measurement applies to different parts of the Web pages including title, body, anchor text and U...
Towards a Keyword-Focused Web Crawler
This paper concerns predicting the content of textual web documents based on features extracted from web pages that link to them. It may be applied in an intelligent, keyword-focused web crawler. The experiments made on publicly available real data obtained from Open Directory Project with the use of several classification models are promising and indicate potential usefulness of the studied ap...
Journal
Journal title: TELKOMNIKA (Telecommunication, Computing, Electronics and Control)
Year: 2012
ISSN: 2302-9293,1693-6930
DOI: 10.12928/telkomnika.v10i1.78